Abstract

An analytical relationship between the statistical significance of an observed
signal and the signal width in the case of a large background is obtained.
It can help to explain why high-energy experiments may reach different conclusions
on the existence of new particles. We illustrate our approach using
experimental data from searches for the Θ+(1530) pentaquark state. The obtained
relationship is also useful for planning future experiments designed
to search for signals of new particles in invariant-mass distributions.



1 Introduction 



With the increase of the energy of colliding beams, high-energy experiments have to deal
with the problem of a large combinatorial background, which makes searches for new
narrow states¹ in invariant-mass distributions rather difficult. This was the
case with the top-quark discovery, searches for new charmonium states and pentaquark
searches. Usually, observed signals do not have high statistical significance, and a
decision has to be made about whether the observed peak is significant enough
to claim an observation, or whether it should be disregarded.

The situation with particle searches can be rather complicated and confusing,
especially when several similar experiments reach different conclusions on the
existence of a signal. In such cases, it is important to know the instrumental
differences between these experiments. For signal reconstruction in invariant-mass
distributions, it becomes crucial to understand the influence of the experimental
resolution on the statistical significance of a signal. In this paper, we obtain a
relationship between the statistical significance of a signal and its width. For narrow
states, the peak width is mainly determined by the experimental resolution, which can
differ between experiments. Therefore, the obtained relationship could clarify
cases in which some experiments observe a signal while others do not see it.

To be more specific, let us recall that two similar collider experiments at DESY
(Hamburg) [1,2] and two similar fixed-target experiments at IHEP (Protvino) [3,4]
do not agree on the existence of an exotic state with a mass close to 1530 MeV,
which may be interpreted as a strange pentaquark (PQ) state Θ+(1530) [5]. While
the ZEUS [1] and SVD-2 [3] experiments claim the observation of a narrow peak
near 1530 MeV in the decay mode Θ+ → pK0_S, the H1 and SPHINX collaborations
do not see it.

It should be mentioned that the pK0_S decay channel is not exotic, since the
standard Σ baryons can decay to this channel as well. However, the observed peak
position and the narrow width are very close to the theoretical expectation for the
Θ+ state [5]. In addition, a Σ state near 1530 MeV is unknown [6], and it was never
seen in previous exclusive reactions where experimental conditions were favorable
for the observation of Σ states. There are several indications that the observed
particle has properties different from those of the usual baryons produced in quark
and gluon fragmentation [7,8].

In this paper, we illustrate the influence of the track resolution on the statistical
significance using the Θ+ → pK0_S decays. Our analysis is rather general and can
be used to establish a relationship between the width of the observed signal and its
statistical significance in any experiment where a large background is expected.

¹ We define narrow states as states with a natural width less than 10~^ · m, where
m is the particle mass.

2 Influence of inactive material on the pK0_S reconstruction



First, let us study the effect of inactive material in front of a tracking device on
the reconstructed mass resolution for the Θ+ → pK0_S decay channel. Clearly, the
reconstructed width depends on the quality of the tracking device itself, as well as on
the software reconstruction. For simplicity, our toy simulation ignores these
effects completely; our set-up represents an ideal case in which the mass resolution is
mainly due to the unrecoverable loss of information on the original particle momenta
after rescattering on inactive material. For example, a beam pipe and/or the inner
wall of a tracking device represent inactive material that degrades
the track resolution. If a micro-vertex detector is not used as an active
detector in the reconstruction of the K0_S and the proton, the amount of material
increases further and, as we show below, the penalty is a significant worsening of
the resolution for the Θ+ reconstruction.

For this study, a Geant [9] simulation was used. We made a simple detector
set-up which consists of an aluminum target followed by a tracking chamber. Each
simulated event consists of a single initial Θ+ particle with its momentum vector
perpendicular to the target, which is located 1 cm away from the injection point.
Only the pK0_S decay mode was considered. The initial momentum of the Θ+ was 1 GeV.
In this case, the absolute momentum of the proton is below 1 GeV, ensuring that the
dE/dx identification is possible. We assume no energy loss in the tracking simulation.
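
The statement that the proton momentum stays below 1 GeV can be checked with
two-body decay kinematics. The sketch below is only an illustration: it assumes a
Θ+ mass of 1.530 GeV (the value discussed in the text) and rounded PDG masses for
the proton and the K0; the function names are hypothetical.

```python
import math

# Masses in GeV. The Theta+ mass is the 1.530 GeV value discussed in the text;
# the proton and K0 masses are rounded PDG values (assumed inputs).
M_THETA = 1.530
M_PROTON = 0.9383
M_K0 = 0.4976

def two_body_momentum(m_parent, m1, m2):
    """Momentum of either daughter in the rest frame of the parent."""
    term = (m_parent**2 - (m1 + m2)**2) * (m_parent**2 - (m1 - m2)**2)
    return math.sqrt(term) / (2.0 * m_parent)

def max_lab_momentum(p_parent, m_parent, m1, m2):
    """Largest lab momentum of daughter 1 (emitted along the parent direction)."""
    gamma = math.hypot(p_parent, m_parent) / m_parent
    beta_gamma = p_parent / m_parent
    p_star = two_body_momentum(m_parent, m1, m2)
    e_star = math.hypot(p_star, m1)
    return gamma * p_star + beta_gamma * e_star

p_star = two_body_momentum(M_THETA, M_PROTON, M_K0)
p_max = max_lab_momentum(1.0, M_THETA, M_PROTON, M_K0)
print(f"p* = {p_star:.3f} GeV, maximum proton momentum = {p_max:.3f} GeV")
```

Under these assumptions the largest proton momentum, reached when the proton is
emitted along the Θ+ flight direction, remains below 1 GeV.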

Below we study the dependence of the reconstructed width on the amount
of material. We use the word "width" to denote the standard deviation of a Gaussian
distribution used to describe the mass resolution.

The K0_S mesons from the Θ+ decays interact with the aluminum target and then
decay into π+π−. Our algorithm calculates the invariant mass of two oppositely
charged tracks to reconstruct the K0_S. For the protons, any track which is not
associated with the pions from the K0_S is used. Then, the K0_S is combined with a
proton track to reconstruct the Θ+ mass. Table 1 shows the width w for K0_S and
Θ+ when reconstructed tracks are used (the second and third columns), or when
the true information on the pion momenta is used (the last column). The latter
reconstruction illustrates the effect of the proton reconstruction on the
width of the Θ+. Also shown is the width of the Θ+ state when the K0_S mass is fixed
to the PDG value [6] (the fourth column). This approach was used in most experiments
in order to improve the mass resolution.
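
The reconstruction steps above can be sketched as follows. This is a minimal
illustration, not the algorithm used in the simulation: the track momenta are
made-up numbers, the particle masses are rounded PDG values, and all function
names are hypothetical.

```python
import math

M_PION = 0.13957    # charged-pion mass in GeV (rounded PDG value)
M_PROTON = 0.93827  # proton mass in GeV (rounded PDG value)
M_K0S = 0.49761     # K0S mass in GeV (rounded PDG value)

def energy(p, m):
    """Energy of a particle with momentum vector p = (px, py, pz) and mass m."""
    return math.sqrt(p[0]**2 + p[1]**2 + p[2]**2 + m**2)

def invariant_mass(tracks):
    """Invariant mass (GeV) of a list of (momentum_vector, mass) pairs."""
    e = sum(energy(p, m) for p, m in tracks)
    px = sum(p[0] for p, _ in tracks)
    py = sum(p[1] for p, _ in tracks)
    pz = sum(p[2] for p, _ in tracks)
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))

def mass_with_k0s_constraint(pi1, pi2, proton):
    """pK0S mass with the K0S mass fixed to its nominal value."""
    k0s_momentum = tuple(a + b for a, b in zip(pi1, pi2))
    return invariant_mass([(k0s_momentum, M_K0S), (proton, M_PROTON)])

# Hypothetical track momenta in GeV, for illustration only.
pi_plus, pi_minus, proton = (0.30, 0.10, 0.20), (0.15, -0.12, 0.25), (0.05, 0.02, 0.60)

m_pipi = invariant_mass([(pi_plus, M_PION), (pi_minus, M_PION)])
m_pk = invariant_mass([(pi_plus, M_PION), (pi_minus, M_PION), (proton, M_PROTON)])
m_pk_fixed = mass_with_k0s_constraint(pi_plus, pi_minus, proton)
print(f"m(pi pi) = {m_pipi:.4f}, m(pK0S) = {m_pk:.4f}, constrained = {m_pk_fixed:.4f}")
```

Fixing the K0_S mass removes the pion-pair mass smearing from the pK0_S mass, which
is why it improves the resolution.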

Additional material in front of a tracking device leads to significant rescattering,
as illustrated in Fig. 1. Obviously, this is reflected in a worsening of the pK0_S
resolution. In addition, this may affect the proton-track reconstruction efficiency,
since a reconstruction program has to decide whether the proton
track belongs to a "primary-track" category or not (assuming that Θ+ → pK0_S
is a strong decay). However, such an effect is rather difficult to estimate
without a realistic Monte Carlo simulation.

Although our simulation represents a rather simplified situation, ignoring
many details of a realistic track reconstruction, the resulting width values of
K0_S and Θ+ are rather sensible. We illustrate this using two similar experiments at
HERA, ZEUS and H1. The ZEUS pK0_S resolution (from a Monte Carlo simulation)
is 2.5 ± 0.5 MeV, while the amount of material in front of the central tracking is
0.3 cm [10] of aluminum. This agrees with the 2.93 MeV resolution given in Table 1
(fourth column for L = 0.3 cm).

For the K0_S, ZEUS fits the π+π− peak using two Gaussian distributions with a
common peak position. The first Gaussian has a width of about 3.5 MeV and describes
the main peak, while the second Gaussian, with a width of 7 MeV, describes
the tails (which typically contain about 27% of all reconstructed candidates) [1].
Thus, the total mass resolution is 4-5 MeV, and it can reasonably be approximated
by the first Gaussian. As a cross check, the Half Width at Half Maximum (HWHM)
was calculated using the plot shown in [1] to recover the Gaussian width of the signal.
The width was found to be around 4 MeV, which agrees with the single-Gaussian width
for K0_S given in a previous ZEUS publication [11]. The ZEUS K0_S mass resolution
agrees reasonably well with the 4.23 MeV width given in Table 1 for L = 0.3 cm.
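
The two width estimates used above can be illustrated with a short calculation.
The sketch assumes that the 27% tail fraction is the relative weight of the second
Gaussian and takes the total resolution as the RMS of the two-Gaussian mixture;
the HWHM value passed to the conversion is a made-up example, since the value read
off the plot is not given in the text.

```python
import math

def sigma_from_hwhm(hwhm):
    """Gaussian standard deviation from the half width at half maximum."""
    return hwhm / math.sqrt(2.0 * math.log(2.0))

def mixture_rms(sigmas, fractions):
    """RMS of a sum of Gaussians that share the same peak position."""
    total = sum(fractions)
    return math.sqrt(sum(f * s**2 for s, f in zip(sigmas, fractions)) / total)

# Widths in MeV from the text; the 73% / 27% split is an assumed reading.
rms = mixture_rms([3.5, 7.0], [0.73, 0.27])
print(f"two-Gaussian RMS = {rms:.1f} MeV")

# Hypothetical HWHM of 4.7 MeV, used only to show the conversion.
print(f"sigma from HWHM = {sigma_from_hwhm(4.7):.1f} MeV")
```

With these inputs the mixture RMS falls inside the 4-5 MeV range quoted above.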

In the case of H1, the K0_S mass peak has a width of 8 MeV, two times larger than
for ZEUS. This was obtained by analysing the HWHM of the K0_S mass distribution
shown in [2]. The quoted RMS for the π+π− mass spectrum is 9.2 MeV [12]. According
to Table 1, this width corresponds to an amount of inactive material that is twice
as large. Indeed, H1 has an additional detector in front of the central tracking [2],
which is absent in the ZEUS case.

Table 2 demonstrates the relationship between the mass resolutions and the
conclusions from the Θ+ searches in the experiments which have studied identical
reactions. It is evident that the experiments with positive evidence for the Θ+
state have better tracking resolution² than those which have null evidence. The K0_S
resolution can be translated to the resolution expected for the Θ+ state using Table 1.
For example, for the values quoted above, the resolutions of Θ+ are 2.9 MeV and
5-6 MeV for the ZEUS and H1 cases, respectively. These numbers from our simulation
agree with those quoted in the ZEUS and H1 papers [1,2].
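
The translation from a K0_S width to an expected Θ+ width can be sketched as a
table lookup. The example below uses the second and fourth columns of Table 1 and
assumes piecewise-linear interpolation with clamping outside the tabulated range;
both choices are illustrative and are not taken from the text.

```python
# Table 1: w(K0S) and the fourth-column w(Theta+) (K0S mass fixed), in MeV,
# for aluminum thicknesses from 0.3 cm to 1.0 cm.
W_K0S = [4.23, 5.10, 5.79, 6.31, 6.60, 7.69, 8.05, 8.33]
W_THETA = [2.93, 3.38, 3.99, 4.32, 5.49, 5.19, 5.95, 5.97]

def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; values outside the table are clamped."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])

def theta_width_from_k0s_width(w_k0s):
    """Expected Theta+ width (MeV) for a given K0S width (MeV)."""
    return interpolate(w_k0s, W_K0S, W_THETA)

for label, w_k0s in (("ZEUS", 4.0), ("H1, HWHM estimate", 8.0), ("H1, quoted RMS", 9.2)):
    print(f"{label}: w(K0S) = {w_k0s} MeV -> w(Theta+) = {theta_width_from_k0s_width(w_k0s):.1f} MeV")
```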

It is conceivable that the different conclusions on the existence of the Θ+ state
may be related to differences in the mass resolution, which determines the observed
peak width in the case of narrow resonances. Below we study the dependence of
the statistical significance of the signal observation on its width in the case of a
large background.

² ZEUS also has higher statistics for the pK0_S combinations than H1.

3 Statistical significance and the peak width



There are several ways to define the statistical significance, S, of an observed signal.
We use the most practical definition, which is often used in experimental papers.
We define S as the ratio of the total number of reconstructed signal entries, N,
to its error δN, i.e. S = N/δN. The numbers N and δN can be found
from a fit using a Gaussian plus a background function. The standard deviation, σ,
provides the best estimate of the statistical error δN [13], and therefore the
statistical significance is often quoted in the form S·σ. We do not use the definition
of S in terms of the probability for the background to fluctuate to a "signal" with a
certain number of observed events, since such a definition would require significant
computational time for the studies discussed below.

The statistical significance of a signal observation depends on several factors,
the most crucial of which are: 1) the production cross section; 2) the background level
under the peak; 3) the experimental resolution; 4) the shape of the background. Below we
consider a somewhat simplified situation, assuming that a Gaussian-like signal is
located on a smoothly falling convex-like background, since this is the most common
situation for many experiments. The location of the signal is assumed to be known.
The case when the signal is expected on a background hump is more difficult and,
in some cases, can be avoided by changing the selection cuts.

The statistical significance S is usually expressed via the numbers of signal
events and background events. Such a definition does not contain the signal width
explicitly. Therefore, we use different variables for our analysis. We define
ρ^bkg as the density of the background in the mass region where the signal is expected.
It is calculated as the number of combinations at the expected location of the peak
position divided by the bin width. ρ^bkg does not depend very strongly on the
shape of the background in the signal region, assuming that the signal is narrow and
the background is sufficiently smooth. This variable is determined by the available
statistics and the combinatorial background contributing to the invariant mass.

Another independent variable, f, is the ratio of the number of expected signal
events N_s to the background density, f = N_s/ρ^bkg. This variable can be calculated
from the expected signal cross section, the available luminosity and the acceptance. It
should be noted that we do not define the fraction as the number of
signal events divided by the number of background events under the peak, because
such a definition would depend on the peak width, w. This has to be
avoided; the peak width is another independent variable.
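
As a small worked example of these two variables, the sketch below uses the ZEUS
numbers quoted in Section 4 (about 270 combinations per 5 MeV bin and 155 signal
events). The function names are hypothetical, and the ±2w counting window in the
last function is an arbitrary choice, included only to show why a fraction defined
under the peak would depend on the width.

```python
def background_density(entries_per_bin, bin_width_gev):
    """rho_bkg: background entries at the peak position per unit mass (GeV^-1)."""
    return entries_per_bin / bin_width_gev

def signal_fraction(n_signal, rho_bkg):
    """f = N_s / rho_bkg, in GeV; independent of the peak width."""
    return n_signal / rho_bkg

def fraction_under_peak(n_signal, rho_bkg, width, n_sigma=2.0):
    """Alternative definition (not used in the text): signal over the background
    counted in a +-n_sigma*width window, ignoring the signal lost outside it."""
    return n_signal / (rho_bkg * 2.0 * n_sigma * width)

rho = background_density(270, 0.005)   # GeV^-1
f = signal_fraction(155, rho)          # GeV
print(f"rho_bkg = {rho:.0f} GeV^-1, f = {f:.5f} GeV")

# The window-based fraction changes with the width, while f does not.
for w in (0.003, 0.009):
    print(f"w = {w * 1e3:.0f} MeV: fraction under peak = {fraction_under_peak(155, rho, w):.3f}")
```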

Our task is to obtain a relationship between S and three independent variables,
ρ^bkg, f and w. For this, we have carried out several Monte Carlo "experiments" by
generating background distributions combined with a Gaussian signal with a fixed
peak position at 1520 MeV. The shape of the background was taken from [1,2]; in
fact, such a threshold shape is rather typical for many searches. The width of the
Gaussian distribution was varied between 3 MeV and 14 MeV. The fraction of the
signal events, f, and the background density ρ^bkg were also varied over a wide range.
For each generated distribution with the background plus the Gaussian signal, the
statistical significance was estimated by fitting the peak using exactly the same
functions as those used to generate the mass spectra. To reduce the number of
unstable fits, the peak position was fixed to the expected value of 1520 MeV during
the fit procedure.

Figure 2 shows the mass distributions simulated using a threshold function plus
a Gaussian with peak widths of 3, 6 and 9 MeV, respectively. The statistical
significance, under the assumption that the location of the signal is known, was
estimated by performing a fit with a Gaussian plus the threshold function. The fraction
of the signal events, as well as the total number of background events, was the
same in all cases. It is seen that the statistical significance decreases as the
peak width increases; for w = 3 MeV, one can claim a discovery, while for w = 9 MeV,
the statistical significance of the observation is low.
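
The trend of a falling significance with a growing width can be reproduced with a
much simpler toy than the one used for Figure 2. The sketch below is not the
procedure described in the text: it swaps the fits to randomly generated spectra
for a linear least-squares error estimate evaluated on expectation values, holds
the peak position and width fixed, and assumes a locally flat background with a
free level and slope. The fit window and all function names are hypothetical, so
the absolute values of S are not expected to match those in Figure 2.

```python
import math

def gauss_bin_fraction(lo, hi, mean, sigma):
    """Fraction of a unit-area Gaussian that falls between lo and hi."""
    scale = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((hi - mean) / scale) - math.erf((lo - mean) / scale))

def expected_significance(n_sig, rho_bkg, width, mean=1.520,
                          half_window=0.100, bin_width=0.0025):
    """S = N / dN for a fit with free signal yield, background level and slope."""
    n_bins = int(round(2.0 * half_window / bin_width))
    a = [[0.0] * 3 for _ in range(3)]
    for i in range(n_bins):
        lo = mean - half_window + i * bin_width
        hi = lo + bin_width
        g = gauss_bin_fraction(lo, hi, mean, width)
        # Derivatives of the bin content with respect to the three parameters.
        basis = (g, 1.0, 0.5 * (lo + hi) - mean)
        expected = n_sig * g + rho_bkg * bin_width   # flat background assumed
        for j in range(3):
            for k in range(3):
                a[j][k] += basis[j] * basis[k] / expected
    # Variance of the yield: (0,0) element of the inverse of the symmetric matrix.
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] ** 2)
           - a[0][1] * (a[0][1] * a[2][2] - a[1][2] * a[0][2])
           + a[0][2] * (a[0][1] * a[1][2] - a[1][1] * a[0][2]))
    var_n = (a[1][1] * a[2][2] - a[1][2] ** 2) / det
    return n_sig / math.sqrt(var_n)

rho = 617 / 0.0025        # GeV^-1, background density of Figure 2
n_sig = 0.0017 * rho      # signal yield implied by f = 0.0017 GeV
values = [expected_significance(n_sig, rho, w) for w in (0.003, 0.006, 0.009)]
for w_mev, s in zip((3, 6, 9), values):
    print(f"w = {w_mev} MeV: S = {s:.1f}")
```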

Figure 3 shows the calculated statistical significance as a function of f and ρ^bkg
for widths of 3 MeV and 9 MeV, respectively. The statistical significance was estimated
as for Fig. 2, i.e. using a fit with a Gaussian distribution to extract the number
of signal events. As expected, S increases with increasing f and ρ^bkg. The
irregularities seen in Fig. 3 are due to unstable fits, in which the resulting Gaussian
width differs from the expected width by a factor of two. In such cases, the
statistical significance was set to zero. The fraction of unstable fits was ~6%, and
it increases at small ρ^bkg and f.

For a fixed peak width w, the statistical significance as a function of the variables
f and ρ^bkg can be fitted using the function:

$$ S = p_1 f + p_2 f \rho^{\mathrm{bkg}}. \qquad (1) $$

Figure 3 shows the fit results using the above function (bottom plots). The function
gives a reasonable fit, with typical values of χ²/ndf around 0.9-1.4.
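
Since Eq. (1) is linear in p_1 and p_2, a fit of this form can be done in closed
form. The sketch below is an unweighted least-squares version applied to synthetic
points; the input values of p_1 and p_2 and the grid of f and ρ^bkg are invented
for the illustration, and the fit in the text may have used a different (for
example, error-weighted) procedure.

```python
def fit_p1_p2(points):
    """Unweighted least-squares fit of S = p1*f + p2*f*rho to (f, rho, S) points."""
    s_xx = s_xy = s_yy = s_xs = s_ys = 0.0
    for f, rho, s in points:
        x, y = f, f * rho          # the two basis functions of Eq. (1)
        s_xx += x * x
        s_xy += x * y
        s_yy += y * y
        s_xs += x * s
        s_ys += y * s
    det = s_xx * s_yy - s_xy ** 2
    p1 = (s_xs * s_yy - s_ys * s_xy) / det
    p2 = (s_ys * s_xx - s_xs * s_xy) / det
    return p1, p2

# Synthetic, noise-free points generated with made-up parameters.
TRUE_P1, TRUE_P2 = 1000.0, 0.005
points = [(f, rho, TRUE_P1 * f + TRUE_P2 * f * rho)
          for f in (0.001, 0.002, 0.003)
          for rho in (50000.0, 100000.0, 200000.0)]

p1, p2 = fit_p1_p2(points)
print(f"p1 = {p1:.1f}, p2 = {p2:.5f}")
```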

Next, p_1 and p_2 as functions of w were fitted using a second-order polynomial.
This ultimately leads to the following parameterisation of the statistical significance
as a function of f, ρ^bkg and w:

$$ S = c_0 + c_1 w + c_2 w^2, \qquad c_i = f \left( a_i + b_i \rho^{\mathrm{bkg}} \right), \qquad (2) $$

where a_0 = 1440, a_1 = -61000, a_2 = 0, b_0 = 0.0115, b_1 = -1.1 and b_2 = 44 (here we
drop units for simplicity). This parameterisation is expected to be correct within
20% accuracy in a region close to the usual discovery threshold of 5σ, and when
w is much smaller than the convex radius of the background shape. For very large
values of S or a very complex background shape it may fail. Using Eq. (2), one can
easily reproduce the significance numbers shown in Figure 2.
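
Eq. (2) is straightforward to evaluate numerically. The sketch below assumes that
w and f are expressed in GeV and ρ^bkg in GeV^-1, a convention inferred from the
inputs quoted for Figure 2 rather than stated with the coefficients; the function
name is hypothetical.

```python
# Coefficients of Eq. (2) as printed in the text.
A = (1440.0, -61000.0, 0.0)
B = (0.0115, -1.1, 44.0)

def significance(f, rho_bkg, w):
    """Eq. (2): S = c0 + c1*w + c2*w**2 with c_i = f*(a_i + b_i*rho_bkg).

    Assumed units: f and w in GeV, rho_bkg in GeV^-1.
    """
    return sum(f * (a + b * rho_bkg) * w ** i for i, (a, b) in enumerate(zip(A, B)))

# Inputs of Figure 2: rho_bkg = 617/0.0025 GeV^-1 and f = 0.0017 GeV.
rho = 617 / 0.0025
f = 0.0017
for w_mev in (3, 6, 9):
    print(f"w = {w_mev} MeV: S = {significance(f, rho, w_mev * 1e-3):.2f}")
```

With these inputs the significance falls steadily from w = 3 MeV to w = 9 MeV.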

Thus, for a known background density, expected signal fraction (or cross section)
and expected signal width (which could be due to the detector resolution,
the natural peak width or a combination of both), one can predict the expected
signal significance for a convex-like background. This does not require a Monte Carlo
simulation.



4 Analysis of similar experiments searching for the pK0_S bump

The previous analysis can be applied directly to the experiments searching for the
Θ+ state. Let us consider the case when the production cross sections are expected
to be the same, while two experiments reach different conclusions on the existence of
the state, as in the case of the ZEUS and H1 experiments. The ZEUS experiment [1]
observes a narrow peak near 1522 MeV, while H1 does not see it [2]. In the ZEUS case,
ρ^bkg ≈ 270/0.005 GeV^-1 and f = 155/54000 GeV, assuming the most conservative
single-Gaussian fit, which does not take into account the background shape near the
peak³. The observed Θ+ width was ~5 MeV (see Table 1 in [1]).

At present, it is only possible to compare the ZEUS and H1 resolutions from
Monte Carlo simulations. For ZEUS, the pK0_S resolution is 2 MeV [1], while it is close
to 6 MeV [2,14] near 1.52 GeV for H1. For a conservative estimate, we assume that
the Monte Carlo simulation tends to underestimate the resolution and that the 5 MeV
width observed by ZEUS is entirely accounted for by the tracking resolution (the
natural width of the Θ+ is tiny in this case). Using the above input values and the
parameterisation of Eq. (2), one obtains a statistical significance of 4.2σ. This is
close to the statistical significance for the Θ+ quoted in the ZEUS paper, taking into
account the experimental uncertainty on the extracted width.

In the H1 case, the K0_S width is a factor of two larger than in the ZEUS case, and
this can lead to a pK0_S mass resolution close to 10 MeV (see Table 1). Again, assuming
that the Θ+ has a tiny width, this resolution should define the peak width. If one sets
the background density as in the H1 case, ρ^bkg = 220/0.005 GeV^-1 [2], while keeping
f as in the ZEUS case, one obtains ~2.9σ, which is an observation of low statistical
significance. It should also be noted that the fraction of unstable fits is around 10%
for a width of 10 MeV, while this fraction is 6% for a 5 MeV width. This
means that the fit will not converge at all in 10% of cases, even when the peak position
is fixed to the expected value during the Gaussian fit.

When H1 uses the ZEUS cut on the proton momentum, p < 1.5 GeV, to increase
the proton purity [2], this leads to a significant decrease in the available statistics
for H1: in this case, ρ^bkg = 70/0.005 GeV^-1. With the same f as in the ZEUS case, one
obtains a statistical significance of 2.5σ in accordance with Eq. (2). This value is
similar to that estimated for the larger ρ^bkg. The small sensitivity of the significance
to the background density is due to the smallness of the second term in Eq. (1) when
ρ^bkg < 60000 GeV^-1.
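
The three estimates of this section can be cross-checked by writing Eq. (2) in the
form of Eq. (1), with p_1(w) and p_2(w) as second-order polynomials. The sketch
below assumes w and f in GeV and ρ^bkg in GeV^-1; the function names are
hypothetical. With the rounded coefficients printed in the text, the results come
out a few per cent above the quoted 4.2σ, 2.9σ and 2.5σ, well within the stated
20% accuracy of the parameterisation.

```python
# Coefficients of Eq. (2) as printed in the text.
A = (1440.0, -61000.0, 0.0)
B = (0.0115, -1.1, 44.0)

def p1(w):
    """First coefficient of Eq. (1) as a function of the width w (GeV)."""
    return A[0] + A[1] * w + A[2] * w ** 2

def p2(w):
    """Second coefficient of Eq. (1) as a function of the width w (GeV)."""
    return B[0] + B[1] * w + B[2] * w ** 2

def significance(f, rho_bkg, w):
    """Eq. (1) with the width dependence of Eq. (2)."""
    return p1(w) * f + p2(w) * f * rho_bkg

f_zeus = 155 / 54000          # GeV, signal fraction taken from the ZEUS case
cases = {
    "ZEUS, w = 5 MeV": (270 / 0.005, 0.005),
    "H1, w = 10 MeV": (220 / 0.005, 0.010),
    "H1 with proton cut, w = 10 MeV": (70 / 0.005, 0.010),
}
results = {}
for label, (rho, w) in cases.items():
    results[label] = significance(f_zeus, rho, w)
    ratio = p2(w) * rho / p1(w)   # size of the second term relative to the first
    print(f"{label}: S = {results[label]:.2f}, second/first term = {ratio:.2f}")
```

The printed ratio shows how much the ρ^bkg-dependent term contributes in each case.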

Thus, the Θ+ signal can easily be missed in the H1 case, even in the simplest case
when the expected cross section, detector acceptance and statistics are the same as
for ZEUS. Indeed, the limit on the Θ+ cross section which has been set by H1 [2] is
similar to the Θ+ cross section measured by ZEUS [15].

³ The double-Gaussian fit leads to a larger number of candidates and to a higher
statistical significance.

The situation could be different if the reconstructed width is mainly determined
by the natural width of the Θ+. In this case, the differences between the H1 and
ZEUS tracking are not so important.

A similar consideration applies to the SVD-2 and SPHINX experiments
[3,4]. The SVD-2 experiment has mass resolutions almost a factor of two better
than SPHINX, see Table 2. Therefore, if all other experimental conditions are
similar, the statistical significance for the Θ+ state is expected to be higher for the
SVD-2 experiment.

5 Conclusions 

We have shown how the width of a Gaussian-like signal is related to the observed
statistical significance in the case of a large convex-like background. This observation
has a direct consequence for many experiments if the reconstructed width of the
signal is mainly determined by the tracking resolution. With the exception of e+e−
colliding experiments, in which baryon production is expected to be more suppressed
than in proton-initiated reactions, there are at present two groups of
similar experiments which have opposite conclusions on the existence of the Θ+ state
in the pK0_S channel: H1 and ZEUS (at DESY) [1,2] and SVD-2 and SPHINX (at
IHEP) [3,4]. We have shown that the ZEUS and SVD-2 experiments, which observe
a peak near 1530 MeV, have better mass resolutions for the K0_S (Λ) and pK0_S invariant
masses than the H1 and SPHINX experiments. Taking into account the observed
relationship between the signal width and the tracking resolution, this fact increases
the chances of ZEUS and SVD-2 to find the narrow signal. We have shown that the
worsening of the tracking resolution for the Θ+ searches is the penalty the
experiments have to pay if a significant amount of inactive material is located in
front of the tracking devices.

It has to be noted that this approach cannot be applied blindly to all experiments
which observe or do not observe the Θ+ signal, since such experiments may have
very different acceptances, background shapes and production mechanisms in the
reaction under study. The approach could help to explain differences in experimental
results if the kinematics and the production cross section are known to be similar.

L (cm)    w(K0_S), MeV    w(Θ+), MeV    w(Θ+), MeV    w(Θ+), MeV
0.3       4.23            3.84          2.93          1.38
0.4       5.10            4.89          3.38          1.57
0.5       5.79            5.79          3.99          1.82
0.6       6.31            6.78          4.32          1.84
0.7       6.60            7.43          5.49          1.90
0.8       7.69            8.82          5.19          2.05
0.9       8.05            9.65          5.95          2.09
1.0       8.33            11.46         5.97          2.28



Table 1: The reconstructed width (see definition in the text) of K0_S and Θ+ as a
function of the thickness, L, of an aluminum target (in cm). The reconstruction of
Θ+ was performed by three different methods: 1) K0_S is reconstructed using tracks
(second column) and combined with the proton track (w(Θ+), third column). 2)
Pion tracks from a K0_S decay are combined with the proton track, while the pK0_S mass
was calculated assuming the nominal K0_S mass (w(Θ+), fourth column). 3) True charged
pions from K0_S and the proton track are combined (w(Θ+), last column). The
reconstruction was performed using a Geant simulation with the initial momentum
of the Θ+ of 1 GeV.



Experiment    reaction                         width, MeV                                   Θ+ signal
ZEUS [11]     ep → K0_S p + X, √s = 300 GeV    w(K0_S) = 4.0, w(Λ) = 1.9                    Yes
H1 [12]       ep → K0_S p + X, √s = 300 GeV    w(K0_S) = 9.2, w(Λ) = 2.9                    No
SVD-2 [3]     pA → K0_S p + X, 70 GeV          w(K0_S) = 4.40 ± 0.08, w(Λ) = 1.60 ± 0.04    Yes
SPHINX [4]    pA → K0_S p + X, 70 GeV          w(K0_S) = 8.4, w(Λ) = 3.8                    No


Table 2: A compilation of K0_S and Λ mass resolutions, w, in the two groups of
experiments with opposite conclusions on the existence of the Θ+. The width w denotes
the standard deviation of a Gaussian distribution (see the text).

Figure 1: The distributions of the x, y and z components of the difference
between the true and the reconstructed proton momentum, Δp_i = |p_tr^i| − |p_r^i|,
after interactions of the protons from the Θ+ decay with an aluminum target of
thickness L = 0.3 cm and 1.0 cm. The distributions were obtained using a Geant
simulation. Here p_tr is the true momentum of the proton originating from the decay
and p_r is the proton momentum reconstructed after the target; d) distributions of
the polar angle between the momenta p_tr and p_r.

Figure 2: The generated mass spectra using a threshold function plus a Gaussian
signal at 1520 MeV. The total number of simulated background events is 65000,
ρ^bkg ≈ 617/0.0025 = 246800 GeV^-1 and f = 0.0017 GeV. The widths of the generated
signals were set as indicated in the figure. Also shown are the statistical
significances S of the extracted signals using a fit with a Gaussian plus a threshold
background.

Figure 3: The reconstructed statistical significance of a Gaussian signal as a
function of the signal fraction f and the background density ρ^bkg under the signal
peak for signal widths of 3 and 9 MeV, respectively. The observed wrinkles are due
to unstable fits (S was set to zero in such cases). Bottom: 2D fits of the distributions
shown above using the fit function p_1 f + p_2 f ρ^bkg. Bins with zero entries
were not included.